AITopics | click-through rate prediction

Collaborating Authors

click-through rate prediction

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Towards Hybrid-grained Feature Interaction Selection for Deep Sparse Network

Neural Information Processing SystemsFeb-16-2026, 02:31:08 GMT

Results from experiments on three large real-world benchmark datasets demonstrate that OptFeature performs well in terms of accuracy and efficiency. Additional studies support the feasibility of our method.

artificial intelligence, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Country:

Oceania > Australia > New South Wales > Sydney (0.14)
North America > Canada > Quebec > Montreal (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(14 more...)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.68)

Add feedback

APG: Adaptive Parameter Generation Network for Click-Through Rate Prediction

Neural Information Processing SystemsDec-24-2025, 21:13:13 GMT

In many web applications, deep learning-based CTR prediction models (deep CTR models for short) are widely adopted. Traditional deep CTR models learn patterns in a static manner, i.e., the network parameters are the same across all the instances. However, such a manner can hardly characterize each of the instances which may have different underlying distributions. It actually limits the representation power of deep CTR models, leading to sub-optimal results. In this paper, we propose an efficient, effective, and universal module, named as Adaptive Parameter Generation network (APG), which can dynamically generate parameters for deep CTR models on-the-fly based on different instances. Extensive experimental evaluation results show that APG can be applied to a variety of deep CTR models and significantly improve their performance. Meanwhile, APG can reduce the time cost by 38.7\% and memory usage by 96.6\% compared to a regular deep CTR model.We have deployed APG in the industrial sponsored search system and achieved 3\% CTR gain and 1\% RPM gain respectively.

adaptive parameter generation network, click-through rate prediction, deep ctr model, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.60)

Add feedback

Time Matters: A Novel Real-Time Long- and Short-term User Interest Model for Click-Through Rate Prediction

Gui, Xian-Jin

arXiv.org Artificial IntelligenceNov-11-2025

Abstract--Click-Through Rate (CTR) prediction is a core task in online personalization platform. A key step for CTR prediction is to learn accurate user representation to capture their interests. Generally, the interest expressed by a user is time-variant, i.e., a user activates different interests at different time. However, most previous CTR prediction methods overlook the correlation between the activated interest and the occurrence time, resulting in what they actually learn is the mixture of the interests expressed by the user at all time, rather than the real-time interest at the certain prediction time. T o capture the correlation between the activated interest and the occurrence time, in this paper we investigate users' interest evolution from the perspective of the whole time line and develop two regular patterns: periodic pattern and time-point pattern. Based on the two patterns, we propose a novel time-aware long-and short-term user interest modeling method to model users' dynamic interests at different time. Extensive experiments on public datasets as well as an industrial dataset verify the effectiveness of exploiting the two patterns and demonstrate the superiority of our proposed method compared with other state-of-the-art ones. Click-Through Rate (CTR) prediction plays an important role in today's online personalization platform (e.g., e-commerce, online advertising, recommender systems), whose goal is to accurately predict the probability of a user clicking a target item in certain context environments. Accurately modeling user interest is fundamental for CTR prediction task.

information, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2511.06213

Country:

Asia > China > Jiangsu Province > Nanjing (0.04)
Asia > Macao (0.04)

Genre: Research Report (0.50)

Industry: Information Technology > Services (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.99)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.67)

Add feedback

9ab8da29b1eb3bec912a06e0879065cd-Paper-Conference.pdf

Neural Information Processing SystemsOct-9-2025, 02:28:58 GMT

artificial intelligence, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Country:

Oceania > Australia > New South Wales > Sydney (0.14)
North America > Canada > Quebec > Montreal (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(15 more...)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.68)

Add feedback

Equip Pre-ranking with Target Attention by Residual Quantization

Li, Yutong, Zhu, Yu, Qiao, Yichen, Guan, Ziyu, Shao, Lv, Liu, Tong, Zheng, Bo

arXiv.org Artificial IntelligenceSep-25-2025

The pre-ranking stage in industrial recommendation systems faces a fundamental conflict between efficiency and effectiveness. While powerful models like Target Attention (TA) excel at capturing complex feature interactions in the ranking stage, their high computational cost makes them infeasible for pre-ranking, which often relies on simplistic vector-product models. This disparity creates a significant performance bottleneck for the entire system. To bridge this gap, we propose TARQ, a novel pre-ranking framework. Inspired by generative models, TARQ's key innovation is to equip pre-ranking with an architecture approximate to TA by Residual Quantization. This allows us to bring the modeling power of TA into the latency-critical pre-ranking stage for the first time, establishing a new state-of-the-art trade-off between accuracy and efficiency. Extensive offline experiments and large-scale online A/B tests at Taobao demonstrate TARQ's significant improvements in ranking performance. Consequently, our model has been fully deployed in production, serving tens of millions of daily active users and yielding substantial business improvements.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2509.16931

Country:

Asia > China > Zhejiang Province > Hangzhou (0.05)
Asia > China > Shanghai > Shanghai (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > China > Shaanxi Province > Xi'an (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.34)

Add feedback

Diffusion-based Multi-modal Synergy Interest Network for Click-through Rate Prediction

Cui, Xiaoxi, Lu, Weihai, Tong, Yu, Li, Yiheng, Zhao, Zhejun

arXiv.org Artificial IntelligenceSep-1-2025

In click-through rate prediction, click-through rate prediction is used to model users' interests. However, most of the existing CTR prediction methods are mainly based on the ID modality. As a result, they are unable to comprehensively model users' multi-modal preferences. Therefore, it is necessary to introduce multi-modal CTR prediction. Although it seems appealing to directly apply the existing multi-modal fusion methods to click-through rate prediction models, these methods (1) fail to effectively disentangle commonalities and specificities across different modalities; (2) fail to consider the synergistic effects between modalities and model the complex interactions between modalities. To address the above issues, this paper proposes the Diffusion-based Multi-modal Synergy Interest Network (Diff-MSIN) framework for click-through prediction. This framework introduces three innovative modules: the Multi-modal Feature Enhancement (MFE) Module Synergistic Relationship Capture (SRC) Module, and the Feature Dynamic Adaptive Fusion (FDAF) Module. The MFE Module and SRC Module extract synergistic, common, and special information among different modalities. They effectively enhances the representation of the modalities, improving the overall quality of the fusion. To encourage distinctiveness among different features, we design a Knowledge Decoupling method. Additionally, the FDAF Module focuses on capturing user preferences and reducing fusion noise. To validate the effectiveness of the Diff-MSIN framework, we conducted extensive experiments using the Rec-Tmall and three Amazon datasets. The results demonstrate that our approach yields a significant improvement of at least 1.67% compared to the baseline, highlighting its potential for enhancing multi-modal recommendation systems. Our code is available at the following link: https://github.com/Cxx-0/Diff-MSIN.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3726302.3729949

2508.2146

Country:

Europe > Italy (0.05)
Asia > China > Beijing > Beijing (0.04)
Asia > China > Shanghai > Shanghai (0.04)
(2 more...)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
(2 more...)

Add feedback

Taming the One-Epoch Phenomenon in Online Recommendation System by Two-stage Contrastive ID Pre-training

Hsu, Yi-Ping, Wang, Po-Wei, Eksombatchai, Chantat, Xu, Jiajing

arXiv.org Artificial IntelligenceAug-27-2025

ID-based embeddings are widely used in web-scale online recommendation systems. However, their susceptibility to overfitting, particularly due to the long-tail nature of data distributions, often limits training to a single epoch, a phenomenon known as the "one-epoch problem." This challenge has driven research efforts to optimize performance within the first epoch by enhancing convergence speed or feature sparsity. In this study, we introduce a novel two-stage training strategy that incorporates a pre-training phase using a minimal model with contrastive loss, enabling broader data coverage for the embedding system. Our offline experiments demonstrate that multi-epoch training during the pre-training phase does not lead to overfitting, and the resulting embeddings improve online generalization when fine-tuned for more complex downstream recommendation tasks. We deployed the proposed system in live traffic at Pinterest, achieving significant site-wide engagement gains.

artificial intelligence, downstream model, machine learning, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3640457.3688053

2508.187

Country:

Europe > Italy > Apulia > Bari (0.06)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > New Finding (0.49)

Industry: Information Technology (0.38)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Action is All You Need: Dual-Flow Generative Ranking Network for Recommendation

Guo, Hao, Xue, Erpeng, Huang, Lei, Wang, Shichao, Wang, Xiaolei, Wang, Lei, Wang, Jinpeng, Chen, Sheng

arXiv.org Artificial IntelligenceAug-19-2025

Deep Learning Recommendation Models (DLRMs) often rely on extensive manual feature engineering to improve accuracy and user experience, which increases system complexity and limits scalability of model performance with respect to computational resources. Recently, Meta introduced a generative ranking paradigm based on HSTU block that enables end-to-end learning from raw user behavior sequences and demonstrates scaling law on large datasets that can be regarded as the state-of-the-art (SOTA). However, splitting user behaviors into interleaved item and action information significantly increases the input sequence length, which adversely affects both training and inference efficiency. To address this issue, we propose the Dual-Flow Generative Ranking Network (DFGR), that employs a dual-flow mechanism to optimize interaction modeling, ensuring efficient training and inference through end-to-end token processing. DFGR duplicates the original user behavior sequence into a real flow and a fake flow based on the authenticity of the action information, and then defines a novel interaction method between the real flow and the fake flow within the QKV module of the self-attention mechanism. This design reduces computational overhead and improves both training efficiency and inference performance compared to Meta's HSTU-based model. Experiments on both open-source and real industrial datasets show that DFGR outperforms DLRM, which serves as the industrial online baseline with extensive feature engineering, as well as Meta's HSTU and other common recommendation models such as DIN, DCN, DIEN, and DeepFM. Furthermore, we investigate optimal parameter allocation strategies under computational constraints, establishing DFGR as an efficient and effective next-generation generative ranking paradigm.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2505.16752

Country: Asia > China (0.16)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

On the Practice of Deep Hierarchical Ensemble Network for Ad Conversion Rate Prediction

Zhuang, Jinfeng, Li, Yinrui, Su, Runze, Xu, Ke, Shao, Zhixuan, Li, Kungang, Leng, Ling, Sun, Han, Qi, Meng, Meng, Yixiong, Tang, Yang, Liu, Zhifang, Shen, Qifei, Mudgal, Aayush, Lu, Caleb, Liu, Jie, Shen, Hongda

arXiv.org Machine LearningApr-23-2025

The predictions of click through rate (CTR) and conversion rate (CVR) play a crucial role in the success of ad-recommendation systems. A Deep Hierarchical Ensemble Network (DHEN) has been proposed to integrate multiple feature crossing modules and has achieved great success in CTR prediction. However, its performance for CVR prediction is unclear in the conversion ads setting, where an ad bids for the probability of a user's off-site actions on a third party website or app, including purchase, add to cart, sign up, etc. A few challenges in DHEN: 1) What feature-crossing modules (MLP, DCN, Transformer, to name a few) should be included in DHEN? 2) How deep and wide should DHEN be to achieve the best trade-off between efficiency and efficacy? 3) What hyper-parameters to choose in each feature-crossing module? Orthogonal to the model architecture, the input personalization features also significantly impact model performance with a high degree of freedom. In this paper, we attack this problem and present our contributions biased to the applied data science side, including: First, we propose a multitask learning framework with DHEN as the single backbone model architecture to predict all CVR tasks, with a detailed study on how to make DHEN work effectively in practice; Second, we build both on-site real-time user behavior sequences and off-site conversion event sequences for CVR prediction purposes, and conduct ablation study on its importance; Last but not least, we propose a self-supervised auxiliary loss to predict future actions in the input sequence, to help resolve the label sparseness issue in CVR prediction. Our method achieves state-of-the-art performance compared to previous single feature crossing modules with pre-trained user personalization features.

artificial intelligence, machine learning, prediction, (15 more...)

arXiv.org Machine Learning

2504.08169

Country:

North America > United States > California > San Francisco County > San Francisco (0.18)
Oceania > Australia > New South Wales > Sydney (0.05)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.82)

Industry:

Marketing (1.00)
Information Technology > Services (0.52)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Addressing Cold-start Problem in Click-Through Rate Prediction via Supervised Diffusion Modeling

Zhu, Wenqiao, Wang, Lulu, Wu, Jun

arXiv.org Artificial IntelligenceApr-10-2025

Predicting Click-Through Rates is a crucial function within recommendation and advertising platforms, as the output of CTR prediction determines the order of items shown to users. The Embedding \& MLP paradigm has become a standard approach for industrial recommendation systems and has been widely deployed. However, this paradigm suffers from cold-start problems, where there is either no or only limited user action data available, leading to poorly learned ID embeddings. The cold-start problem hampers the performance of new items. To address this problem, we designed a novel diffusion model to generate a warmed-up embedding for new items. Specifically, we define a novel diffusion process between the ID embedding space and the side information space. In addition, we can derive a sub-sequence from the diffusion steps to expedite training, given that our diffusion model is non-Markovian. Our diffusion model is supervised by both the variational inference and binary cross-entropy objectives, enabling it to generate warmed-up embeddings for items in both the cold-start and warm-up phases. Additionally, we have conducted extensive experiments on three recommendation datasets. The results confirmed the effectiveness of our approach.

artificial intelligence, diffusion model, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2504.0627

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback